
Migrate crop config creator node #191

Open
wants to merge 6 commits into main

Conversation

aljazkonec1 (Contributor)

Purpose

This PR migrates the ProcessDetections node, renamed to CropConfigCreatorNode, from depthai-experiments to depthai-nodes. The node simplifies the pipeline by removing one extra script node and broadens usability by unblocking RVC2. Suggestions on the node's name, or anything else, are very welcome!

Specification

Add a new node, CropConfigCreatorNode, which creates a dai.ImageManipConfigV2() object for every detection. The node works on both RVC2 and RVC4, and it also addresses the RVC2 pool-size issue that has been blocking depthai-experiments.
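A rough sketch of the per-detection config the node produces, using only calls that appear elsewhere in this PR's diff (the call that sets the crop region itself is omitted, and the sizes are example values):

    import depthai as dai

    # one config is created for every detection in the incoming message
    cfg = dai.ImageManipConfigV2()
    # ... the crop region for this detection is configured here ...
    cfg.setOutputSize(64, 64)   # target_w x target_h: every crop comes out the same size
    cfg.setSequenceNum(37)      # keeps the config tied to the frame it was computed from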

Dependencies & Potential Impact

No breaking changes outright, but depthai-experiments will need an update. One risk is the way the node keeps crop configs in sync with frames: it assumes the last frame is held in the ImgManipV2 node and, on receiving a detection_message, it sends an empty crop config that skips that frame and loads the next frame in the queue. That next frame should be the one the detection_message was produced from.
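A minimal sketch of that skip-and-advance config; setReusePreviousImage is the only call assumed here (it is the flag mentioned in the node's docstring reviewed below):

    import depthai as dai

    # "Empty" config sent once per detection_message: no crop is applied, it only
    # tells the manip node to drop its stored frame and load the next one in the queue.
    skip_cfg = dai.ImageManipConfigV2()
    skip_cfg.setReusePreviousImage(False)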

Deployment Plan

None / not applicable

Testing & Validation

Tested on the age-gender experiment on this branch, on both RVC2 and RVC4, with 0 to 6 faces present in the frame. No unsynced frames were detected. PS: pay attention to how the node is used in main.py and how the ImageManip node that follows it (script_node) has to be set up to keep the sequence correct.

@codecov-commenter commented Mar 17, 2025

Codecov Report

Attention: Patch coverage is 97.97980% with 2 lines in your changes missing coverage. Please review.

Project coverage is 48.68%. Comparing base (91bd184) to head (ec7ce9e).

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
depthai_nodes/node/host_crop_config_creator.py 97.95% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##           main     #191       +/-   ##
=========================================
+ Coverage      0   48.68%   +48.68%     
=========================================
  Files         0       76       +76     
  Lines         0     4404     +4404     
=========================================
+ Hits          0     2144     +2144     
- Misses        0     2260     +2260     


@kkeroo (Collaborator) left a comment:


LGTM otherwise. Please also add a unit test for the node.

setReusePreviousImage flag set to False, which changes the previous frame for a
new one.
"""
assert isinstance(detections_input, ImgDetectionsExtended)
Collaborator:

Should we add ... or isinstance(detections_input, dai.ImgDetections) here? It might come in handy if, e.g., YOLO is the first model of the pipeline.

Contributor Author:

I would prefer not to add it, because then the entire node would have to be split into if ImgDetectionsExtended: ... elif dai.ImgDetections: .... I would rather have YOLO models updated to ImgDetectionsExtended instead of using ImgDetections.

Collaborator:

I think that (sadly) updating the dai YOLO parser to use ImgDetectionsExtended is quite a way out on the roadmap. So, although it is a bit of a pain, I agree we should make the node work with both ImgDetections and ImgDetectionsExtended.

Contributor Author:

Okay, I'll add support for ImgDetections, but the more we account for dai.ImgDetections, the later we will migrate the YOLO parser, as it won't be so "critical".
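For illustration, the widened check discussed here might look like this (a sketch, not the final implementation in this PR):

    import depthai as dai
    from depthai_nodes import ImgDetectionsExtended

    def _check_detections_message(detections_input) -> None:
        # Accept both the extended message and the plain dai detections message.
        assert isinstance(
            detections_input, (ImgDetectionsExtended, dai.ImgDetections)
        ), f"Unsupported detections message type: {type(detections_input).__name__}"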


for i in range(num_detections):
    cfg = dai.ImageManipConfigV2()
    detection: ImgDetectionExtended = detections[i]
Collaborator:

Are the detections sorted by confidence? Because if not, we might not send out the desired ones (e.g. if a model makes 300 detections, and we limit them to 100, then it would make sense to output the top 100 most confident ones).
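For illustration, a top-N selection like the one described could be as simple as this (sketch only, not code from this PR):

    # keep the n_detections most confident detections before creating configs
    detections = sorted(detections, key=lambda d: d.confidence, reverse=True)[:n_detections]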

Contributor Author:

The detections are sorted by whichever parser created the ImgDetectionsExtended message. I thought about adding sorting, but it seems a bit out of place given this is just a config creator node. You don't know whether the user does any other kind of syncing with the detections, and sorting them could completely throw off that sync.

Collaborator:

Agreed, I would leave any sorting out of this node, as it should only do the crop config.

@jkbmrz (Collaborator), Mar 24, 2025:

On second thought, why do we even want to limit the number of detections here? We are planning to add a DetectionsFilter node that will filter the provided detections (e.g. by class, confidence, etc.), and we could add a count limit there as well.

Contributor Author:

I added it as an extra buffer to limit the number of outgoing messages in case the models are large, the device struggles to process all detections, or the visualization gets crammed. I wasn't aware of the DetectionsFilter node, so when it's implemented I can remove n_detections?

Collaborator:

I think we can remove it already; no need to wait for the DetectionsFilter node and then change the implementation of this one.

@klemen1999 (Collaborator) left a comment:

Let's also add unit tests.

@kkeroo (Collaborator) commented Mar 24, 2025

@aljazkonec1 can you also adjust the unit test so it can be used in the stability tests as well? The adjustments are simple:

  1. You need to add a duration fixture, like here.
  2. Add a duration parameter to each test function, like here.
  3. Add a check in each test function for whether duration is set, and perform the necessary checks and validation inside that if statement, like here.

This would then be compatible with the stability tests, and the workflow could also take this node into the stability tests pipeline.

You can check that it works by running pytest --duration 10 in the tests folder to run the tests with a 10-second duration.
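A rough sketch of the shape described above (option, fixture, and helper names are assumptions based on the pytest --duration 10 example; the linked tests are the actual reference):

    # conftest.py
    import pytest

    def pytest_addoption(parser):
        parser.addoption("--duration", type=int, default=None,
                         help="run each test repeatedly for this many seconds")

    @pytest.fixture
    def duration(request):
        return request.config.getoption("--duration")

    # test_crop_configs_creator.py (hypothetical test body)
    import time

    def test_crop_configs_creator(duration):
        run_checks_once()  # hypothetical helper: one normal unit-test pass
        if duration is not None:
            end = time.time() + duration
            while time.time() < end:
                run_checks_once()  # repeat the same validation for the whole duration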

@jkbmrz (Collaborator) left a comment:

Generally LGTM, but adding two small comments.

from depthai_nodes import ImgDetectionExtended, ImgDetectionsExtended


class CropConfigsCreatorNode(dai.node.HostNode):
Collaborator:

Should we rename the node from CropConfigsCreatorNode to CropConfigsCreator? It would better follow the naming of other nodes as none of them have “Node” in their name (they are "nodes" by default).

Collaborator:

+1 on the new name

Contributor Author:

Good catch, I'll change it.

target_h : int
    The height of the target image.
n_detections : int
    The number of detections to keep.

Collaborator:

put the default value in here as well

will keep at most the first 100 detections.

Before use, the source and target image sizes need to be set with the build function.
The node assumes the last frame is saved in the dai.ImgManipV2 node and when receiving a detection_message the node sends an empty crop config that skips the frame and loads the next frame in the queue.

Collaborator:

Why do we need this assumption? How can we get rid of it? Wouldn't it be better if this node also accepted the image, so it could make sure the crops are always created correctly?

@aljazkonec1 (Contributor Author), Mar 24, 2025:

The problem is that this is a host node and runs on the host. Receiving a frame from the device and then sending n frames back to the device completely fills the network and slows the pipeline considerably. And we can't make this a script node because ImgDetectionsExtended is not available on the devices (yet).

Collaborator:

Huh, I see. It is going to be tough to make the host nodes work in both the peripheral and local modes.
I remember there was an ImageManip config, something like wait_for_image and wait_for_config. wait_for_config meant that if the manipulation has an image ready, it has to wait for a manip config before processing it. Maybe, if this is still the case in depthai V3, requiring the manipulation to be configured like this would be a better dependency than this workaround, which will sometimes fail?

    The output link for the ImageManipConfigV2 messages
detections_output : dai.Output
    The output link for the ImgDetectionsExtended message
w : int

Collaborator:

Why can't we work with relative coordinates?

Contributor Author:

When there is a rotated rectangle, cropping in relative coordinates creates a sheared image crop in the original coordinates, because the aspect ratio in relative coordinates is 1:1 while in absolute coordinates it can be anything (16:9, 3:2, etc.).
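A quick numeric illustration of the point (example numbers only):

    # A 0.2 x 0.2 "square" region in relative (normalized) coordinates
    src_w, src_h = 1920, 1080
    rel_w, rel_h = 0.2, 0.2
    abs_w, abs_h = rel_w * src_w, rel_h * src_h  # 384 x 216 px, so not square
    # A rectangle rotated in normalized space therefore maps to a sheared shape
    # in pixels unless the crop is computed in absolute (pixel) coordinates.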

Collaborator:

Isn't that a bug in depthai? I don't see why it matters whether the bbox is rotated or not.

    The height of the source image.
target_w : int
    The width of the target image.
target_h : int

Collaborator:

I don't understand what this does. I can see it's used here: cfg.setOutputSize(self.target_w, self.target_h).
But I don't understand its purpose. Is this a command to resize the crop? If so, what if I don't want to resize it?

Contributor Author:

Yes. Because the size of the detected rectangle can be anything, this ensures the output of the crop node is always the same size, so the cropped images can be used in a second NN, for example.

Contributor Author:

Now that you mention it, I do see that it's an assumption that someone will want same-size outputs. I can change it to be an option (e.g. equalize_crop_size = True). Thoughts, @klemen1999?

Collaborator:

In the minemaestro app, we were generating crops based on bboxes and we did not resize unless the resulting crop exceeded a threshold. So for me, a general node would have these options (see the sketch below):

  1. crop and keep the resulting size
  2. crop and resize to a fixed size; with this, we would need different resizing options, like stretch or letterbox
  3. crop and resize only if the resulting size is bigger than a threshold
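A sketch of how those options might be expressed (names are hypothetical, not part of this PR):

    from enum import Enum

    class ResizeMode(Enum):
        NONE = "none"            # 1. crop and keep the resulting size
        STRETCH = "stretch"      # 2a. crop and stretch to a fixed target size
        LETTERBOX = "letterbox"  # 2b. crop and letterbox to a fixed target size
        THRESHOLD = "threshold"  # 3. resize only if the crop exceeds a size threshold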

cfg.setSequenceNum(sequence_num)
send_status = False
while not send_status:
    send_status = self.config_output.trySend(cfg)

Collaborator:

As mentioned above, this seems like a hack that will break in some situations, and then it's going to be a mess. Or does it work every time?
Also, if this while loop is needed, maybe add a small wait to prevent spamming trySend?

Contributor Author:

trySend already waits for connected outputs to receive the message before returning the boolean, so there is no spamming of messages. In theory this means there should be no mis-sync (at least not due to this while loop).

Collaborator:

OK. Based on dai's tryGet, which doesn't wait, I was under the impression that this would immediately return False if the message cannot be sent.
